Learning an optimal distance metric in a linguistic vector space
نویسندگان
چکیده
Many natural language processing still depend on the Euclidean distance function between the two feature vectors, but it has severe defects as to feature weightings and feature correlations. In this paper we propose an optimal metric distance function that can be used as an alternative to the Euclidean distance, accommodating the two problems at the same time. This metric is optimal in the sense of global quadratic minimization, and can be obtained from the clusters in the training data in a supervised fashion. We confirmed the effect of the proposed metric by the sentence retrieval, document retrieval, and the K-means clustering of general vectorial data.
منابع مشابه
یادگیری نیمه نظارتی کرنل مرکب با استفاده از تکنیکهای یادگیری معیار فاصله
Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...
متن کاملGeometric Modeling of Dubins Airplane Movement and its Metric
The time-optimal trajectory for an airplane from some starting point to some final point is studied by many authors. Here, we consider the extension of robot planer motion of Dubins model in three dimensional spaces. In this model, the system has independent bounded control over both the altitude velocity and the turning rate of airplane movement in a non-obstacle space. Here, in this paper a g...
متن کاملComposite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
متن کاملSome properties of continuous linear operators in topological vector PN-spaces
The notion of a probabilistic metric space corresponds to thesituations when we do not know exactly the distance. Probabilistic Metric space was introduced by Karl Menger. Alsina, Schweizer and Sklar gave a general definition of probabilistic normed space based on the definition of Menger [1]. In this note we study the PN spaces which are topological vector spaces and the open mapping an...
متن کاملDistance-based human action recognition using optimized class representations
We study distance-based classification of human actions and introduce a new metric learning approach based on logistic discrimination for the determination of a low-dimensional feature space of increased discrimination power. We argue that for effective distance-based classification, both the optimal projection space and the optimal class representation should be determined. We qualitatively an...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Systems and Computers in Japan
دوره 37 شماره
صفحات -
تاریخ انتشار 2006